Considering Cloud Native


● Elasticity with increased efficiency
● Flexible and configurable Availability
● Cost Reduction
* https://www.redhat.com/en/topics/cloud-native-apps
Instance sizing


● Per role type: Masters, RegionServers
● Per load (with RS Grouping)
  ○ System tables (meta, namespace, phoenix)
  ○ Use case specific load
TCO


Metrics based automatic scale up/down


● Read/Write latency/throughput based
● RPC latency based
● Compaction queue size
● Overall request load
● Region density
● Available cache space (Cloud Storage deployments)
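
The metrics listed above could feed a simple threshold rule. The following is only a minimal sketch under stated assumptions: the `Pressure` fields, the `HIGH`/`LOW` threshold values and the `scaling_decision` function are all hypothetical and are not taken from HBase or from any autoscaler described in the talk.

```python
from dataclasses import dataclass, asdict


@dataclass
class Pressure:
    """One sample of the metrics named on the slide (field names are hypothetical)."""
    read_latency_ms: float
    write_latency_ms: float
    rpc_latency_ms: float
    compaction_queue_size: float
    requests_per_sec: float
    regions_per_server: float
    cache_used_fraction: float  # 1.0 means no cache space left


# Illustrative thresholds only; real values depend on the workload and hardware.
HIGH = Pressure(50.0, 50.0, 100.0, 200, 50000.0, 300, 0.90)
LOW = Pressure(5.0, 5.0, 10.0, 10, 5000.0, 50, 0.40)


def scaling_decision(current: Pressure, high: Pressure = HIGH, low: Pressure = LOW) -> str:
    """Scale up if any metric is above its high mark, down only if all are below the low mark."""
    cur, hi, lo = asdict(current), asdict(high), asdict(low)
    if any(cur[k] > hi[k] for k in cur):
        return "scale_up"
    if all(cur[k] < lo[k] for k in cur):
        return "scale_down"
    return "hold"
```

Scaling up on any single hot metric but scaling down only when every metric is quiet is a deliberately conservative choice for this sketch; it avoids removing nodes while one resource is still under load.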
TCO


Storage Optimizations 


THE ASF CONFERENCE 


● Block vs Cloud Storage
● Ephemeral Storage for BucketCache


[Diagram: Cloud Storage, Compute Instance]
Storage Options
Cloud Storage


● Benefits
  ○ Cost efficient for large volumes
  ○ Available on major cloud providers
  ○ Decoupled compute from storage


● Challenges
  ○ Increased latency
  ○ HBase native compatibility
    ■ S3 lacks atomic rename (for now)
Storage Options
Cloud Storage
● Mitigating latency
  ○ File based bucket cache over ephemeral storage
    ■ Initial cache warmup required
    ■ Cache must be resilient to RS crashes/restarts
      ● HBASE-27686, HBASE-27743
    ■ Region balancer must consider impacts to the cache
      ● HBASE-27389
  ○ Cloud provider specific tunings
    ■ Throttling, connection/thread pooling configs
    ■ Request hints (random/sequential)
Storage Options
Cloud Storage


● HBase S3 Integration
  ○ HBase originally designed for hierarchical file systems
    ■ HBase internal write operations are two phased:
      ● New files created on a temp dir
      ● Rename new files to final dir at commit time
    ■ Amazon S3 lacks atomic renames
      ● Requires locking the whole dir content
      ● Individual files rename calls to S3 (suboptimal)
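
The two-phase pattern above can be illustrated on a local POSIX file system, where a rename is a single atomic step. This is a minimal sketch, not HBase code: the function name `two_phase_write` and the `.tmp` directory name are assumptions made for the example.

```python
import os
from pathlib import Path


def two_phase_write(store_dir: Path, name: str, data: bytes) -> Path:
    """Write a file under a temp dir, then commit it with a single rename."""
    tmp_dir = store_dir / ".tmp"  # hypothetical temp dir name
    tmp_dir.mkdir(parents=True, exist_ok=True)

    # Phase 1: create the new file where readers never look.
    tmp_path = tmp_dir / name
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

    # Phase 2: commit. On a hierarchical file system the rename is atomic,
    # so readers see either no file or the complete file.
    final_path = store_dir / name
    os.replace(tmp_path, final_path)
    return final_path
```

The commit step depends entirely on the rename being atomic, which is the guarantee the slide notes is missing on S3.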
Cloud Storage
Original Flow


[Sequence diagram: flushCache, createWriter, createWriter (create temp writer), commitStoreFiles (rename files), addStoreFiles]
Storage Options


Cloud Storage
● HBase S3 Integration
  ○ Redesign HBase to not (only) rely on renames
    ■ Store File Tracking: HBASE-26067
      ● Additional layer for tracking store files
      ● Write operations delegate file path decision to the tracking layer
      ● Configurable tracker implementations
        ○ Default: still uses renames
        ○ File Based Tracker: keeps list of valid files in meta files
Storage Options
Cloud Storage


SFT Flow


[Sequence diagram: flushCache, createWriter, commitStoreFiles, addStoreFiles, isTmpWriter]
Storage Options
Cloud Storage


SFT Flow
(detailed)


[Sequence diagram: addStoreFiles, doAddNewStoreFiles, find new prefix, create new meta file, write to new meta file, delete previous meta file]
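
The meta file steps named in the flow above (create a new meta file, write the list to it, delete the previous one) can be sketched as a toy tracker. This is an illustration under assumptions, not the HBase implementation: the class `FileListTracker`, the `meta.<seq>` naming and the JSON encoding are all hypothetical.

```python
import json
from pathlib import Path


class FileListTracker:
    """Toy tracker: the newest meta file holds the list of valid store files."""

    def __init__(self, meta_dir: Path):
        self.meta_dir = Path(meta_dir)
        self.meta_dir.mkdir(parents=True, exist_ok=True)

    def _latest(self):
        # Meta files are named meta.<seq>; the highest sequence number wins.
        metas = sorted(self.meta_dir.glob("meta.*"), key=lambda p: int(p.suffix[1:]))
        return metas[-1] if metas else None

    def load(self) -> list:
        latest = self._latest()
        return json.loads(latest.read_text()) if latest else []

    def add_store_files(self, new_files) -> list:
        previous = self._latest()
        files = self.load() + list(new_files)
        # Pick the next name, create and write the new meta file, then drop the old one.
        seq = int(previous.suffix[1:]) + 1 if previous else 1
        (self.meta_dir / f"meta.{seq}").write_text(json.dumps(files))
        if previous is not None:
            previous.unlink()
        return files
```

Because the list of valid files lives in the meta file, data files can be written directly to their final location and no rename is needed at commit time.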
Storage Options
Cloud Storage
● Limitations
  ○ No durable syncs (fsync/hflush)
    ■ WAL files require durable file sync to overcome crashes
    ■ Low latency critical for write performance


● Deploy HDFS for WAL
  ○ Separate WAL from data directory
  ○ WALs are temporary, so limited space usage
  ○ Use the instances' attached disks for HDFS data
Storage Options
Block Storage


● Benefits
  ○ Higher performance *


● Challenges
  ○ More costly
  ○ Compute/Storage tightly coupled


© 2023 Cloudera, Inc. All rights reserved.




Storage Options
Block Storage


● Block storage volumes attached to the cluster nodes
● HDFS deployed over the nodes
  ○ Both WALs and HBase data on HDFS
● Less cost efficient
  ○ Requires additional/larger volumes attached to instances
  ○ HDFS replication factor requires more storage
  ○ Compute and Storage scale together


● Higher Performance
  ○ Avg 5x better latency/throughput
Storage Options
Cloud vs Block


Cloud Storage
● More cost efficient
● Separate compute from storage:
  ○ Independent scaling
  ○ Data management and availability
● Lower performance
  ○ Mitigated with bucket cache


Block Storage
● Better overall performance
● Higher costs
● Couples compute and storage
  ○ Can't scale independently




Security Simplification


Kerberos vs JWT
● Kerberos
  ○ HBase support since early versions
  ○ Strong client/server authentication (KDC based)
  ○ SASL for the RPC authentication
  ○ Requires clients to be known by the KDC
  ○ Complex to integrate with other services
Security Simplification
Kerberos vs JWT
● JWT
  ○ Token based authentication
    ■ Custom SASL auth provider (HBASE-23347)
    ■ HBase built-in support is in progress (HBASE-26553)
  ○ TLS for RPC (encrypts the token and further info)
    ■ HBASE-26666 recently added TLS for RPC support
  ○ Centralised authentication service
    ■ Easily reusable by clients/other services
Availability and Fault Tolerance
● HBase built-in capabilities
  ○ Master HA
  ○ Table sharding (Regions)
  ○ Meta region replica
● * Multiple Availability Zones (Multi AZ)
  ○ Master instances on different zones
  ○ RSes spread evenly throughout zones
  ○ Increases latency


* https://blog.cloudera.com/high-availability-multi-az-for-cdp-operational-database/
Availability and Fault Tolerance (cont.)


● Limitations
  ○ Cache reload (Cloud Storage deployments) when RSes crash
    ■ HBASE-27313 Persist list of HFiles after prefetch
    ■ HBASE-27743 Enhancement for persistent cache
    ■ HBASE-27389 Cost function that considers bucket cache as part of the plan before region movement
  ○ Single physical region unavailability
    ■ Cross-region cluster replication for better RTO experience
Benchmark


YCSB Workloads 


Dataset size: 20 Billion rows = ~20TB 

Workload C: 100% Read 

Workload A: 50% read, 50% write 

Workload F: 50% read, 25% update, 25% read-modify-update 
Two common deployments, HDFS vs Cloud storage 
Benchmark
Clusters definitions


● AWS
  ○ HDFS
    ■ 20 RSes m5.2xlarge storage with HDD
    ■ 2 Masters m5.8xlarge
  ○ S3
    ■ 20 RSes i3.2xlarge storage as S3
    ■ 2 Masters m5.2xlarge


● Azure
  ○ HDFS
    ■ 20 RSes Standard D8 V3
    ■ 2 Masters Standard D32 V3
  ○ ABFS
    ■ 20 RSes Standard L8s V2
    ■ 2 Masters Standard D8a V4
AWS with HDFS vs S3 (Higher is better)

[Chart: AWS-HBase-Throughput]

Benchmark
AWS with HDFS vs S3 (Lower is better)

[Chart: AWS-HBase-Read Latency]

[Chart: AWS-HBase-Write Latency by workload (WL A Avg Latency, WL F Avg Latency); series: HDFS, S3]

There is an outlier on P95+, where the P95 latency is 28.3
Benchmark
Azure with HDFS vs ABFS (Higher is better)

[Chart: Azure-HBase-Throughput]

Benchmark
Azure with HDFS vs ABFS (Lower is better)

[Chart: Azure-HBase-Write Latency]
Benchmark
Average comparison against HDFS on Block Storage


Workload Type           | Latency S3 vs HDFS | Throughput S3 vs HDFS | Latency ABFS vs HDFS | Throughput ABFS vs HDFS
Read only workload      | 47% lower          | 86% higher            | 48% lower            | 91% higher
50% Read, 50% Write     | R: 52% lower       | 36% higher            | R: 46% lower         | 6% higher
                        | W: 28% higher      |                       | W: 35% lower         |
50% read, 25% update,   | R: 87% lower       | 66% higher            | R: 48% lower         | 66% lower
25% read-modify-update  | W: 29% lower       |                       | W: 35% lower         |
Sample comparison for a 50TB read/write workload on AWS:


        | S3 with ephemeral storage               | HDFS in block storage
        | 3 x m5.2xlarge                          | 3 x m5.8xlarge
        | 40 x i3.2xlarge                         | 40 x m5.2xlarge
        | 64TB ephemeral storage for bucket cache | 190TB EBS volumes for HDFS
        | $10,778.58                              | $7,467.51
        | $1,428.60                               | $9,216.00
Total   | $12,207.18 (-26.8%)                     | $16,683.51


* We estimated the amount of list/put/get requests to S3 at 50 million per month; the actual cost may be slightly higher.
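
The totals and the -26.8% figure on this slide can be rechecked with a few lines of arithmetic. The sketch below uses only the dollar amounts shown; the slide's row labels did not survive, so the two line items per deployment are given the neutral names `item_a` and `item_b`, which are assumptions of this example.

```python
from decimal import Decimal

# Line items as printed on the slide (row labels unknown, names are placeholders).
s3 = {"item_a": Decimal("10778.58"), "item_b": Decimal("1428.60")}
hdfs = {"item_a": Decimal("7467.51"), "item_b": Decimal("9216.00")}

s3_total = sum(s3.values())
hdfs_total = sum(hdfs.values())

# Relative saving of the S3 deployment against the HDFS deployment, in percent.
saving_pct = ((hdfs_total - s3_total) / hdfs_total * 100).quantize(Decimal("0.1"))

print(s3_total, hdfs_total, saving_pct)
```

`Decimal` is used instead of floats so the sums match the printed dollar amounts exactly.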
● Wellington Chevreuil


● Stephen Wu
  ○ SW Engineer at Cloudera, HBase PMC


● Special thanks to OpDB Team @ Cloudera for all the contributions
  ○ Andor, Ankit, Sergey, Surbat, Peter, Shanmukha, Rahul and more people